JupyterCon 2023 - Paris
import pandas as pd
from bokeh.sampledata.penguins import data as df
df.head()
| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | MALE |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | FEMALE |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | FEMALE |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | FEMALE |
df.plot.scatter(x='bill_length_mm', y='bill_depth_mm')
<Axes: xlabel='bill_length_mm', ylabel='bill_depth_mm'>
.hvplot
To enable the .hvplot accessor on pandas objects, simply import hvplot.pandas.
import hvplot.pandas
df.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    by='species'
)
df.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    by='species',
    hover_cols=['sex', 'island']
)
df.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    by='species',
    subplots=True, width=250
)
groupby
df.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm',
    groupby=['species', 'sex'], width=500,
)
Objects returned by calls to .hvplot are HoloViews objects.
scatter_bill_depth = df.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', width=300)
print(scatter_bill_depth)
:Scatter [bill_length_mm] (bill_depth_mm)
hist_bill_length = df.hvplot.hist('bill_length_mm', width=300)
HoloViews objects can be composed.
scatter_bill_depth + hist_bill_length
p1 = df.query('species == "Adelie"').hvplot.scatter(x='body_mass_g', y='bill_depth_mm', c='red', label='Adelie')
p2 = df.query('species == "Gentoo"').hvplot.scatter(x='body_mass_g', y='bill_depth_mm', c='blue', label='Gentoo')
p1 * p2
hist_bill_depth = df.hvplot.hist('bill_depth_mm', width=300)
import holoviews as hv
ls = hv.link_selections.instance()
ls(hist_bill_length + hist_bill_depth)
.hvplot() can display large datasets interactively thanks to Datashader, which is enabled with rasterize=True.
flights = pd.read_parquet('airline_flights')
len(flights)
918205
flights.hvplot.scatter(x='distance', y='airtime', rasterize=True)
import panel as pn
import xarray as xr
ds = xr.tutorial.load_dataset('air_temperature')
air = ds.air
ds
<xarray.Dataset>
Dimensions: (lat: 25, time: 2920, lon: 53)
Coordinates:
* lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
* lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
air (time, lat, lon) float32 241.2 242.5 243.5 ... 296.5 296.2 295.7
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
.interactive
To enable .interactive on an object, you need to make the corresponding import.
import hvplot.xarray
Here's an example of a very simple pipeline that chains two method calls. We want to replace the fixed value of 0 passed to .isel with a widget, and see how the output changes.
# static pipeline
air.isel(time=0).mean()
<xarray.DataArray 'air' ()>
array(274.16626, dtype=float32)
Coordinates:
time datetime64[ns] 2013-01-01
We create a Panel IntSlider widget that produces values suitable for passing to .isel.
slider = pn.widgets.IntSlider(name='time', start=0, end=10)
slider
We start the interactive pipeline by calling .interactive() on the data structure. The object it returns exposes the same API as the data structure it wraps.
pipeline = air.interactive()
print(pipeline)
<hvplot.xarray.XArrayInteractive object at 0x2ba068b80>
We then write the same pipeline as before on this object, replacing 0 with the widget we've just created.
# interactive pipeline
pipeline.isel(time=slider).mean()
The output of the interactive pipeline includes the set of widgets that drive it, together with the pipeline output that is updated whenever a widget value changes.
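The idea behind this can be illustrated with a toy, pure-Python sketch. It is not hvPlot's implementation: `Slider` and `Deferred` are hypothetical classes written for this example only. The wrapper records method calls instead of running them, then replays them with the widget's current value each time the result is requested.

```python
class Slider:
    """Hypothetical stand-in for a widget: it only stores a value."""

    def __init__(self, value):
        self.value = value


class Deferred:
    """Toy wrapper that records method calls instead of running them."""

    def __init__(self, obj, steps=()):
        self._obj = obj
        self._steps = tuple(steps)

    def __getattr__(self, name):
        # Called only for names Deferred does not define itself,
        # i.e. the methods of the wrapped object.
        def record(*args, **kwargs):
            return Deferred(self._obj, self._steps + ((name, args, kwargs),))
        return record

    def eval(self):
        """Replay the recorded calls using each widget's current value."""
        def resolve(arg):
            return arg.value if isinstance(arg, Slider) else arg

        result = self._obj
        for name, args, kwargs in self._steps:
            args = [resolve(a) for a in args]
            kwargs = {key: resolve(val) for key, val in kwargs.items()}
            result = getattr(result, name)(*args, **kwargs)
        return result


width = Slider(7)
# Same method names as on a plain str, but nothing runs yet
pipeline = Deferred('paris').center(width, '*').upper()

print(pipeline.eval())  # *PARIS*
width.value = 9         # simulate moving the slider
print(pipeline.eval())  # **PARIS**
```

The real .interactive object goes further: it also renders the widgets and re-evaluates the pipeline automatically when a widget value changes.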
While .hvplot is of course well supported, an interactive pipeline can also output a Matplotlib plot generated from Xarray's .plot.
time = pn.widgets.Player(name='time', start=0, end=10, loop_policy='loop', interval=500)
air.interactive(loc='bottom', align='center').isel(time=time).plot()
air.interactive(loc='bottom', align='center').isel(time=time).hvplot()
temp_unit = pn.widgets.Select(name='Temperature unit', options={'K': 0, 'C': 273.15})
pipeline = (
    air.interactive()
    .isel(time=pn.widgets.IntRangeSlider)  # Infer the values automatically
    .to_dataframe()  # Compute a Pandas DataFrame
    .groupby('time')
    .mean()
)
pipeline = pipeline - temp_unit # Support for operators
pipeline.hvplot.line('time', 'air', width=500)
from datetime import datetime
import yfinance
# Widget for the function as input
w_ticker = pn.widgets.Select(name='Ticker', options=['NVDA', 'AAPL', 'IBM', 'GOOG', 'MSFT'])
# Define a loading function that returns a Pandas DataFrame
def load_from_yahoo(ticker: str) -> pd.DataFrame:
    return yfinance.download(
        ticker, start=datetime(2020, 1, 3), end=datetime(2022, 4, 29), progress=False
    )
# Bind the function to a widget and make the bound function interactive.
pipeline = hvplot.bind(load_from_yahoo, w_ticker).interactive(loc='left')
# Widgets for the regular pipeline
w_resample = pn.widgets.RadioButtonGroup(options=['W', 'M'])
w_dt_range = pn.widgets.DateRangeSlider(start=datetime(2020, 1, 3), end=datetime(2022, 4, 29))
(
    pipeline
    .loc[(pipeline.index >= w_dt_range.param.value_start) & (pipeline.index <= w_dt_range.param.value_end)]
    .resample(w_resample).agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last'})
    .hvplot.ohlc(grid=True, title=w_ticker, width=500)
)
diff = air.interactive.sel(time=pn.widgets.DiscreteSlider) - air.mean('time')
kind = pn.widgets.Select(options=['contourf', 'contour', 'image'], value='image')
interactive = diff.hvplot(cmap='RdBu_r', clim=(-20, 20), kind=kind)
template = pn.template.BootstrapTemplate(
    title='Interactive pipeline',
    sidebar=["## Select a time and type of plot", *interactive.widgets()],
    main=[interactive.panel()],
)
template.show();
Launching server at http://localhost:55483
WARNING:bokeh.core.validation.check:W-1005 (FIXED_SIZING_MODE): 'fixed' sizing mode requires width and height to be set: Row(id='p5712', ...)
hvexplorer = hvplot.explorer(df)
hvexplorer
The explorer can be used to explore data and edit plots. Once you are satisfied with a plot, you can save its settings with settings() or get a string with plot_code() that you can copy/paste and execute to reproduce the plot.
plot_settings = hvexplorer.settings()
plot_settings
{'by': ['species'],
'kind': 'scatter',
'title': 'JupyterCon23',
'x': 'bill_length_mm',
'y': ['bill_depth_mm']}
df.hvplot(**plot_settings)
hvexplorer.plot_code()
"df.hvplot(by=['species'], kind='scatter', title='JupyterCon23', x='bill_length_mm', y=['bill_depth_mm'])"
df.hvplot(by=['species'], kind='scatter', title='JupyterCon23', x='bill_length_mm', y=['bill_depth_mm'])